Slowing Down Top Trees for Better Worst-Case Bounds

Authors

  • Bartlomiej Dudek
  • Pawel Gawrychowski
Abstract

We consider the top tree compression scheme introduced by Bille et al. [ICALP 2013] and construct an infinite family of trees on n nodes labeled from an alphabet of size σ, for which the size of the top DAG is Θ((n / log_σ n) · log log_σ n). Our construction matches a previously known upper bound and exhibits a weakness of this scheme, as the information-theoretic lower bound is Ω(n / log_σ n). This settles an open problem stated by Lohrey et al. [arXiv 2017], who designed a more involved version achieving the lower bound. We show that this can also be guaranteed by a very minor modification of the original scheme: informally, one only needs to ensure that different parts of the tree are not compressed too quickly. Arguably, our version is more uniform, and in particular, the compression procedure is oblivious to the value of σ.
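
To make the gap between the two bounds explicit, here is a short worked comparison. It is only a sketch derived from the bounds quoted in the abstract, with constant factors ignored:

```latex
\[
\underbrace{\frac{n}{\log_\sigma n}\cdot \log\log_\sigma n}_{\text{top DAG size on the constructed family}}
\;\Big/\;
\underbrace{\frac{n}{\log_\sigma n}}_{\text{information-theoretic lower bound}}
\;=\; \log\log_\sigma n
\]
```

On this family, the original scheme therefore exceeds the lower bound by a factor of Θ(log log_σ n), which is the overhead the modified scheme is stated to avoid.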

Similar resources

Technical Report No. 2011-577 State Complexity of Star and Quotient Operation for Unranked Tree Automata

We consider the state complexity of extensions of the Kleene star and quotient operations to unranked tree languages. Due to the nature of the tree structure, there are two distinct ways to define the star operation for trees; we call these operations, respectively, bottom-up and top-down star. We show that (n+ 3 2 )2 states are sufficient and necessary in the worst case to recognize the bottom...

A Tight Lower Bound for Top-Down Skew Heaps

Previously, it was shown in a paper by Kaldewaij and Schoenmakers that for top-down skew heaps the amortized number of comparisons required for meld and delmin is upper bounded by log_φ n, where n is the total size of the inputs to these operations and φ = (√5 + 1)/2 denotes the golden ratio. In this paper we present worst-case sequences of operations on top-down skew heaps in which each applicat...

On domain-partitioning induction criteria: worst-case bounds for the worst-case based

One of the most popular induction schemes for supervised learning is also one of the oldest. It builds a classifier in a top-down fashion, following the minimization of a so-called index criterion. While numerous papers have reported experiments on this scheme, little has been known on its theoretical aspect until recent works on decision trees and branching programs using a powerful classificatio...

Tree Compression with Top Trees

Abstract. We introduce a new compression scheme for labeled trees based on top trees [3]. Our compression scheme is the first to simultaneously take advantage of internal repeats in the tree (as opposed to the classical DAG compression that only exploits rooted subtree repeats) while also supporting fast navigational queries directly on the compressed representation. We show that the new compre...

Efficient Implementation of Lazy Suffix Trees

We present an efficient implementation of a write-only top-down construction for suffix trees. Our implementation is based on a new, space-efficient representation of suffix trees which requires only 12 bytes per input character in the worst case, and 8.5 bytes per input character on average for a collection of files of different types. We show how to efficiently implement the lazy evaluation of ...

Journal title:
  • CoRR

Volume abs/1801.01059

Publication date: 2018